Using STFT real and imaginary parts of modulation signals for MMSE-based speech enhancement
Abstract
In this paper we investigate an alternative RI-modulation (R = real, I = imaginary) analysis-modification-synthesis (AMS) framework for speech enhancement, in which the real and imaginary parts of the modulation signal are processed in secondary AMS procedures. This framework offers theoretical advantages over previously proposed modulation AMS frameworks: noise is additive in the modulation signal, and the noisy acoustic phase is not used to reconstruct speech. Using MMSE magnitude estimation to modify the modulation magnitude spectra, the initial experiments presented in this work evaluate whether these advantages translate into improvements in processed speech quality. The effects of speech presence uncertainty and log-domain processing on MMSE magnitude estimation in the RI-modulation framework are also investigated. Finally, a comparison of different enhancement approaches applied in the RI-modulation framework is presented. Using subjective and objective experiments as well as spectrogram analysis, we show that RI-modulation MMSE magnitude estimation with speech presence uncertainty produces stimuli that listeners prefer over those of the other RI-modulation types. In comparison with similar approaches in the modulation AMS framework, the theoretical advantages of the RI-modulation framework did not translate into an improvement in overall quality, with both frameworks yielding very similar-sounding stimuli; however, a clear improvement in speech intelligibility over the corresponding modulation AMS based approach was found. © 2013 Elsevier B.V. All rights reserved.
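The two-stage structure described in the abstract can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the authors' implementation: frames are non-overlapping and rectangular (a practical system would use overlapping tapered windows), the DFT is computed naively, and `subtractive_gain` is a hypothetical stand-in for the MMSE magnitude estimator, which the abstract does not specify in detail. All function and parameter names are invented for this sketch.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a short sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def secondary_ams(track, frame_len, gain):
    """AMS on one real-valued trajectory: scale the modulation
    magnitudes by gain(magnitude) and keep the modulation phase."""
    out = []
    for start in range(0, len(track) - frame_len + 1, frame_len):
        spec = dft(track[start:start + frame_len])
        modified = [gain(abs(c)) * c for c in spec]
        out.extend(v.real for v in idft(modified))
    return out

def ri_modulation_enhance(x, acoustic_len, modulation_len, gain):
    """Process the real and imaginary parts of each acoustic-frequency
    trajectory separately, then recombine and resynthesize."""
    # Primary analysis: one acoustic spectrum per frame.
    frames = [dft(x[s:s + acoustic_len])
              for s in range(0, len(x) - acoustic_len + 1, acoustic_len)]
    # Only whole modulation frames are processed in this sketch.
    n_out = (len(frames) // modulation_len) * modulation_len
    enhanced = [[0j] * acoustic_len for _ in range(n_out)]
    for k in range(acoustic_len):
        real_out = secondary_ams([f[k].real for f in frames], modulation_len, gain)
        imag_out = secondary_ams([f[k].imag for f in frames], modulation_len, gain)
        for m in range(n_out):
            # The noisy acoustic phase is never reused: the enhanced
            # spectrum is rebuilt from the two processed parts.
            enhanced[m][k] = complex(real_out[m], imag_out[m])
    # Primary synthesis: inverse transform and concatenate frames.
    y = []
    for spec in enhanced:
        y.extend(v.real for v in idft(spec))
    return y

def subtractive_gain(mag, noise_mag=0.5, floor=0.1):
    """Hypothetical placeholder gain (magnitude subtraction with a floor)."""
    if mag <= 0.0:
        return floor
    return max(1.0 - noise_mag / mag, floor)

x = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
y_identity = ri_modulation_enhance(x, 8, 4, lambda mag: 1.0)
y_sub = ri_modulation_enhance(x, 8, 4, subtractive_gain)
```

With a unit gain the chain reduces to analysis followed by synthesis, so `y_identity` should match `x` up to rounding error. Passing the gain in as a function keeps the framework separate from the choice of estimator, which mirrors the abstract's comparison of several enhancement approaches inside the same RI-modulation framework.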
Similar resources
Speech enhancement using STFT of real and imaginary parts of modulation signals
This paper investigates an alternative modulation (RI-modulation) AMS-based framework for speech enhancement, in which the real and imaginary parts of the modulation signal are processed in secondary AMS procedures. We propose to apply MMSE magnitude estimation in this framework and, using subjective experiments, show that MMSE RI-modulation magnitude estimation produces stimuli that are preferred by ...
Supergaussian GARCH Models
In this paper, we introduce supergaussian generalized autoregressive conditional heteroscedasticity (GARCH) models for speech signals in the short-time Fourier transform (STFT) domain. We address the problem of speech enhancement, and show that estimating the variances of the STFT expansion coefficients based on GARCH models yields higher speech quality than by using the decision-directed metho...
Speech spectral modeling and enhancement based on autoregressive conditional heteroscedasticity models
In this paper, we develop and evaluate speech enhancement algorithms, which are based on supergaussian generalized autoregressive conditional heteroscedasticity (GARCH) models in the short-time Fourier transform (STFT) domain. We consider three different statistical models, two fidelity criteria, and two approaches for the estimation of the variances of the STFT coefficients. The statistical mo...
Enhancement and Recognition of Reverberant and Noisy Speech by Extending Its Coherence
Most speech enhancement algorithms make use of the short-time Fourier transform (STFT), which is a simple and flexible time-frequency decomposition that estimates the short-time spectrum of a signal. However, the duration of short STFT frames is inherently limited by the nonstationarity of speech signals. The main contribution of this paper is a demonstration of speech enhancement and automati...
Speech Enhancement using Laplacian Mixture Model under Signal Presence Uncertainty
In this paper, an estimator for speech enhancement based on a Laplacian mixture model is proposed. The proposed method estimates the complex DFT coefficients of clean speech from noisy speech using the MMSE estimator, when the clean speech DFT coefficients are assumed to follow a mixture of Laplacians and the noise DFT coefficients are assumed to follow a zero-mean Gaussian distribution. Furthermore, the MMS...
Journal:
- Speech Communication
Volume 58
Publication date: 2014